Cleaning Your Wrong Google Scholar Entries
Abstract
Entity categorization, the process of grouping entities into categories for a specific purpose, is an important problem with many applications, such as Google Scholar and Amazon products. Unfortunately, many real-world categories contain mis-categorized entities, such as publications on a researcher's Google Scholar page that were actually written by other people. We have proposed a general framework for a new research problem: discovering mis-categorized entities. In this demonstration, we present GSCleaner, a Google Chrome extension that is one important application of this problem. Attendees will be able to try the following features: (1) mis-categorized entity discovery: an attendee can check for mis-categorized entities on anyone's Google Scholar page; and (2) on-site cleaning: any attendee can log in and clean their own Google Scholar page using GSCleaner. We describe our novel rule-based framework for discovering mis-categorized entities and propose effective optimization techniques for applying the rules. Empirical results show the effectiveness of GSCleaner in discovering mis-categorized entities.
Keywords: mis-categorized entity; Google Scholar cleaner; rule-based framework; signature
Similar resources
Data Quality Not Your Typical Database Problem
Textbook database examples are often wrong and simplistic. Unfortunately, data is never born clean or pure. Errors, missing values, repeated entries, inconsistent instances and unsatisfied business rules are the norm rather than the exception. Data cleaning (also known as data cleansing, record linkage and many other terminologies) is growing as a major application requirement and an interdiscip...
A bibliometric study of Video Retrieval Evaluation Benchmarking (TRECVid): A methodological analysis
This paper provides a discussion and analysis of methodological issues encountered during a scholarly impact and bibliometric study within the field of computer science (TRECVid Text Retrieval and Evaluation Conference, Video Retrieval Evaluation). The purpose of this paper is to provide a reflection and analysis of the methods used to provide useful information and guidance for those who may w...
Optimize Your Article for Search Engine
This article provides guidelines on how to optimize scholarly literature for academic search engines like Google Scholar, in order to increase the article visibility and citations.
Using "Cited by" Information to Find the Context of Research Papers
This paper proposes a novel method of analyzing data to find important information about the context of research papers. The proposed CCTVA (Collecting, Cleaning, Translating, Visualizing, and Analyzing) method helps researchers find the context of papers on topics of interest. Specifically, the method provides visualization information that maps a research topic’s evolution and links to other ...
Hunter X Scholar – Finger out Famous Men in Your Research Area
As the growth of the WWW, scientists and researchers publishing their research information on the web may become an essential comportment in academia, an enormous number of web pages provide information on scientists, research papers, and technical documents in the Internet and indexed by search engines. For a junior student or junior researcher, it is a nontrivial task to know/search authorita...